Model-based estimation of instantaneous pitch in noisy speech
نویسندگان
چکیده
In this paper we propose a model-based approach to instantaneous pitch estimation in noisy speech, by way of incorporating pitch smoothness assumptions into the well-known harmonic model. In this approach, the latent pitch contour is modeled using a basis of smooth polynomials, and is fit to waveform data by way of a harmonic model whose partials have time-varying amplitudes. The resultant nonlinear least squares estimation task is accomplished through the Gauss-Newton method with a novel initialization step that serves to greatly increase algorithm efficiency. We demonstrate the accuracy and robustness of our method through comparisons to state-of-the art pitch estimation algorithms using both simulated and real waveform data.
منابع مشابه
On a robust F0 estimation of speech based on IRAPT using robust TV-CAR analysis
Fundamental frequency (F0) estimation is important in speech processing such as speech coding, synthesis, recognition and so on. A present F0 estimation method performs well under clean condition, however the performance deteriorates significantly in noisy environment. As a result, robust F0 estimation against additive noise is demanded. We have previously proposed F0 estimation methods based o...
متن کاملPitch estimation of noisy speech signals using empirical mode decomposition
This paper presents a pitch estimation method of noisy speech signal using empirical mode decomposition (EMD). The normalized autocorrelation function (NACF) of the noisy speech signal is decomposed into a finite set of band-limited signals termed as intrinsic mode functions (IMFs) using EMD. The periodicity of one IMF is supposed to be equal to the accurate pitch period. A conventional autocor...
متن کاملMulti-band summary correlogram-based pitch detection for noisy speech
A multi-band summary correlogram (MBSC)-based pitch detection algorithm (PDA) is proposed. The PDA performs pitch estimation and voiced/unvoiced (V/UV) detection via novel signal processing schemes that are designed to enhance the MBSC’s peaks at the most likely pitch period. These peak-enhancement schemes include comb-filter channel-weighting to yield each individual subband’s summary correlog...
متن کاملA fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency
This paper proposes a robust and accurate F0 estimation method for noisy speech. This method uses two different principles: (1) an F0 estimation based on periodicity and harmonicity of instantaneous amplitude for a robust estimation in noisy environments, and (2) an F0 estimation based on stability of instantaneous frequency as an accurate estimation method. The proposed method also uses a comb...
متن کاملKalman tracking of linear predictor and harmonic noise models for noisy speech enhancement
This paper presents a speech enhancement method based on the tracking and denoising of the formants of a linear prediction (LP) model of the spectral envelope of speech and the parameters of a harmonic noise model (HNM) of its excitation. The main advantages of tracking and denoising the prominent energy contours of speech are the efficient use of the spectral and temporal structures of success...
متن کامل